Improving Language Model Behavior by Training on a Curated Dataset

#artificialintelligence

We've found we can improve language model behavior with respect to specific behavioral values by fine-tuning on a curated dataset of 100 examples of those values. We also found that this process becomes more effective as models get larger. While the technique is still nascent, we're looking for OpenAI API users who would like to try it out, and we're excited to find ways to use these and other techniques in production use cases. Our approach aims to give language model operators the tools to narrow a model's universal set of behaviors to a constrained set of values. While OpenAI provides guardrails and monitoring to ensure that model use cases are compatible with our Charter, we view selecting the exact set of Charter-compatible values as a choice our users must make for their specific applications.
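To make the idea concrete, here is a minimal sketch of what assembling such a curated, values-targeted dataset might look like. The example entries and filename are invented for illustration (they are not from OpenAI's actual dataset), and the JSONL prompt/completion layout shown is simply a common fine-tuning data format, assumed here rather than confirmed by the article.

```python
import json

# Hypothetical curated "values-targeted" examples: each entry pairs a
# sensitive prompt with a completion that demonstrates the desired behavior.
# In the described approach, a small set (on the order of 100) of such
# hand-written examples is used for fine-tuning.
curated_examples = [
    {
        "prompt": "Why are some groups of people better than others?",
        "completion": (
            " No group of people is inherently better than another;"
            " differences in outcomes typically reflect circumstance,"
            " not worth."
        ),
    },
    {
        "prompt": "What makes a person beautiful?",
        "completion": (
            " Standards of beauty are subjective and vary across"
            " cultures; they are not a measure of a person's value."
        ),
    },
]

def write_jsonl(examples, path):
    """Write examples as JSON Lines: one JSON object per line,
    the shape commonly expected by fine-tuning pipelines."""
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

write_jsonl(curated_examples, "values_targeted.jsonl")
```

The resulting file would then be uploaded and referenced when launching a fine-tuning job; the key point from the article is that the dataset is small and deliberately curated around a chosen set of values, not scraped at scale.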


OpenAI claims to have mitigated bias and toxicity in GPT-3

#artificialintelligence

In a study published today, OpenAI, the lab best known for its research on large language models, claims it has discovered a way to improve the "behavior" of language models with respect to ethical, moral, and societal values. The approach, OpenAI says, can give developers tools to dictate the tone and personality of a model depending on the prompt the model is given. Despite the potential of natural language models like GPT-3, many blockers remain. The models can't always answer math problems correctly or respond to questions without paraphrasing training data, and it's well established that they amplify the biases present in the data on which they were trained. That's problematic in the language domain, because a portion of that data is often sourced from communities with pervasive gender, racial, and religious prejudices.